Abstract. Linearity tests are randomized algorithms which have oracle access to the
truth table of some function $f$, and are supposed to distinguish between linear functions
and functions which are far from linear. Linearity tests were first introduced by Blum,
Luby and Rubinfeld in [BLR93], and were later used in the PCP theorem, among other
applications. The quality of a linearity test is described by its completeness $c$ (the
probability that it accepts linear functions), its soundness $s$ (the probability that it
accepts functions far from linear), and its query complexity $q$ (the number of queries it makes).

Linearity tests have been studied with the aim of decreasing their soundness while
keeping the query complexity small (one reason being to improve PCP constructions).
Samorodnitsky and Trevisan constructed in [ST00] the Complete Graph Test, and proved
that no Hyper Graph Test can perform better than the Complete Graph Test. Later, in
[ST06], they proved, among other results, that no non-adaptive linearity test can perform
better than the Complete Graph Test. Their proof uses the algebraic machinery of the
Gowers Norm. A result by Ben-Sasson, Harsha and Raskhodnikova [BHR05] allows this
lower bound to be generalized to adaptive linearity tests as well.

We also prove the same optimal lower bound for adaptive linearity tests, but our proof
technique is arguably simpler and more direct than the one used in [ST06]. Like [ST06],
we study the behavior of linearity tests on quadratic functions. However, instead of
analyzing the Gowers Norm of certain functions, we provide a more direct combinatorial
proof, studying the behavior of linearity tests on random quadratic functions. This proof
technique also lets us prove the lower bound directly for adaptive linearity tests.



1. Introduction 

We study the relation between the number of queries and the soundness of adaptive linearity
tests. A linearity test (over the field $\mathbb{F}_2$, for example) is a randomized algorithm which has
oracle access to the truth table of a function $f : \{0,1\}^n \to \{0,1\}$, and needs to distinguish
between the following two extreme cases:

(1) $f$ is linear

(2) $f$ is far from linear functions

A function $f$ is called linear if it can be written as $f(x_1,\ldots,x_n) = a_1 x_1 + \ldots + a_n x_n$
with $a_1,\ldots,a_n \in \mathbb{F}_2$. The agreement of two functions $f,g : \{0,1\}^n \to \{0,1\}$ is defined as
$d(f,g) = |\Pr_x[f(x) = g(x)] - \Pr_x[f(x) \neq g(x)]|$. $f$ is far from linear functions if it has small
agreement with all linear functions (we make this definition precise in Section 2).

Linearity tests were first introduced by Blum, Luby and Rubinfeld in [BLR93]. They
presented the following test (coined the BLR test), which makes only 3 queries to $f$:

(1) Choose $x, y \in \{0,1\}^n$ at random

(2) Verify that $f(x + y) = f(x) + f(y)$.

Bellare et al. [BCH+96] gave a tight analysis of the BLR test. It is obvious that the
BLR test always accepts a linear function. They showed that if the test accepts a
function $f$ with probability $1/2 + \epsilon$, then $f$ has agreement at least $2\epsilon$ with some linear
function.
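
The BLR test described above can be sketched in a few lines of Python. This is a minimal illustration under our own conventions, not code from [BLR93]: points of $\{0,1\}^n$ are encoded as $n$-bit integers (so addition in $\mathbb{F}_2^n$ is bitwise XOR), and the names `linear_function`, `blr_test` and `blr_acceptance_probability` are hypothetical helpers introduced only for this example.

```python
import random

def linear_function(a):
    """Return the linear function x -> <a, x> over F_2 (a and x are n-bit integers)."""
    return lambda x: bin(a & x).count("1") % 2

def blr_test(f, n, rng=random):
    """Run the BLR test once: pick x, y at random and check f(x+y) = f(x) + f(y)."""
    x = rng.getrandbits(n)
    y = rng.getrandbits(n)
    return f(x ^ y) == (f(x) ^ f(y))

def blr_acceptance_probability(f, n):
    """Exact acceptance probability, by enumerating all pairs (x, y)."""
    points = range(2 ** n)
    accepted = sum(f(x ^ y) == (f(x) ^ f(y)) for x in points for y in points)
    return accepted / 4 ** n
```

For instance, any function built by `linear_function` is accepted with probability 1, while the AND of two bits is accepted with probability 5/8 when $n = 2$.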

For a linearity test, we say that it has completeness $c$ if it accepts any linear function
with probability at least $c$. A test has perfect completeness if $c = 1$. A linearity test has
soundness $s$ if it accepts any function $f$ with agreement at most $\epsilon$ with all linear functions
with probability at most $s + \epsilon'$, where $\epsilon' \to 0$ when $\epsilon \to 0$. We define the query complexity
$q$ of a test as the maximal number of queries it performs. The BLR test has perfect
completeness, soundness $s = 1/2$ (with $\epsilon' = 2\epsilon$) and query complexity $q = 3$.

If one repeats a linearity test with query complexity $q$ and soundness $s$ independently
$t$ times, the query complexity grows to $q' = qt$ while the soundness reduces to $s' = s^t$. So,
it makes sense to define the amortized query complexity $\bar{q}$ of a test as $\bar{q} = q/\log_2(1/s)$.
Independent repetition of a test doesn't change its amortized query complexity. Notice
that the BLR test has amortized query complexity $\bar{q} = 3$.
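
The invariance of the amortized query complexity under independent repetition can be checked numerically. The sketch below is only an illustration; the function names `amortized_query_complexity` and `repeat` are our own and do not come from the cited works.

```python
import math

def amortized_query_complexity(q, s):
    """Amortized query complexity q / log2(1/s) of a test with q queries and soundness s."""
    return q / math.log2(1 / s)

def repeat(q, s, t):
    """Parameters (q', s') = (q*t, s^t) of t independent repetitions of a test."""
    return q * t, s ** t
```

With the BLR parameters $q = 3$, $s = 1/2$, both `amortized_query_complexity(3, 0.5)` and `amortized_query_complexity(*repeat(3, 0.5, 5))` evaluate to 3.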

Linearity tests are a key ingredient in the PCP theorem, started in the works of Arora
and Safra [AS98] and Arora, Lund, Motwani, Sudan and Szegedy [ALM+98]. In order to
improve PCP constructions, linearity tests were studied with the aim of improving their
amortized query complexity.

Samorodnitsky and Trevisan [ST00] have generalized the basic BLR linearity test. They
introduced the Complete Graph Test. The Complete Graph Test (on $k$ vertices) is:

(1) Choose $x_1,\ldots,x_k \in \{0,1\}^n$ independently

(2) Verify $f(x_i + x_j) = f(x_i) + f(x_j)$ for all $i,j$

This test has perfect completeness and query complexity $q = \binom{k}{2} + k$. They show that all
the $\binom{k}{2}$ tests that the Complete Graph Test performs are essentially independent, i.e. that
the test has soundness $s = 2^{-\binom{k}{2}}$. This gives the test amortized query complexity
$\bar{q} = 1 + \Theta(1/\sqrt{q})$. They show that this test is optimal among the family of Hyper-Graph
Tests (see [ST00] for the definition of this family of linearity tests), and raise the question of
whether the Complete Graph Test is optimal among all linearity tests, i.e. does a test with
the same query complexity but with better soundness exist?
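
As an illustration, the Complete Graph Test on $k$ vertices can be sketched as follows. This is a minimal sketch under our own conventions and not the presentation of [ST00]: points are $n$-bit integers, addition is XOR, and the helper names are hypothetical.

```python
import random
from itertools import combinations
from math import comb

def complete_graph_test(f, n, k, rng=random):
    """Run the Complete Graph Test on k vertices once."""
    xs = [rng.getrandbits(n) for _ in range(k)]
    fx = [f(x) for x in xs]  # the k vertex queries
    # one further query per edge {i, j}: at most C(k, 2) of them
    return all(f(xs[i] ^ xs[j]) == (fx[i] ^ fx[j])
               for i, j in combinations(range(k), 2))

def query_complexity(k):
    """Total number of queries q = C(k, 2) + k."""
    return comb(k, 2) + k

def amortized_query_complexity(k):
    """q / log2(1/s) with s = 2^(-C(k, 2)); requires k >= 2."""
    return query_complexity(k) / comb(k, 2)
```

Since `amortized_query_complexity(k)` equals $1 + 2/(k-1)$, it approaches 1 as $k$ grows, in line with the $1 + \Theta(1/\sqrt{q})$ behaviour stated above.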

They partially answer this question in [ST06], where (among many other results) they
show that no non-adaptive linearity test can perform better than the Complete Graph Test.
A test is called non-adaptive if it first chooses $q$ locations in the truth table of $f$, then queries
them, and based on the results accepts or rejects $f$. Otherwise, a test is called adaptive. An
adaptive test may decide on its query locations based on the values of $f$ in previous queries.

The proof technique of [ST06] uses the algebraic analysis of the Gowers Norm of certain
functions. The Gowers Norm is a measure of local closeness of a function to a low degree
polynomial. For more details regarding the definition and properties of the Gowers Norm,
see [GT05] and [Sam07].

Ben-Sasson, Harsha and Raskhodnikova prove in [BHR05] that any adaptive linearity
test with completeness $c$, soundness $s$ and query complexity $q$ can be transformed into
a non-adaptive linearity test with the same query complexity, perfect completeness and
soundness $s' = s + 1 - c$. Combining their result with the result of [ST06] proves the lower
bound also for adaptive linearity tests.

We also prove the same optimal lower bound for adaptive linearity tests, but our proof
technique is arguably simpler and more direct than the one used in [ST06]. Like [ST06],
we study the behavior of linearity tests on quadratic functions. However, instead of
employing algebraic analysis of the Gowers Norm of certain functions, we provide a more
direct combinatorial proof, studying the behavior of linearity tests on random quadratic
functions. This proof technique also lets us prove the lower bound directly for adaptive
linearity tests.

1.1. Our techniques 

We model adaptive tests using test trees. A test tree $T$ is a binary tree, where each
inner vertex $v$ carries a label $x(v) \in \{0,1\}^n$, and the leaves are labeled with either
accept or reject. Running a test tree on a function $f$ is done by querying $f$, at each stage,
on the label of the current vertex (starting at the root), and following one of the two edges
leaving the vertex, depending on the query response. When reaching a leaf, its label (accept
or reject) is the value that $f$ gets in $T$. An adaptive test $\mathcal{T}$ can always be modeled as
first randomly choosing a test tree from some set $\{T_i\}$, according to some distribution on
the test trees, then running the test tree on $f$.

It turns out that in order to prove a lower bound which matches the upper bound of the
Complete Graph Test, it is enough to consider functions $f$ which are quadratic. Actually,
it's enough to consider $f$ which is a random quadratic function.



A function $f$ is quadratic if it can be presented as

$$f(x_1,\ldots,x_n) = \sum_{i,j} a_{i,j} x_i x_j + \sum_i b_i x_i + c$$

for some values $a_{i,j}, b_i, c \in \mathbb{F}_2$. We study the behavior of running test trees on a random
linear function, and on a random quadratic function.

The main idea is as follows. Let $v$ be some inner vertex in a test tree $T$, with the path
from the root of $T$ to $v$ being $v_0,\ldots,v_{k-1},v$. If $x(v)$ is linearly dependent on $x(v_0),\ldots,x(v_{k-1})$,
then when running $T$ on any linear function $f$, the value of $f(x(v))$ can be deduced from the
already known values of $f(x(v_0)),\ldots,f(x(v_{k-1}))$. Therefore, if the vertex $v$ is reached, then
the same edge leaving $v$ will always be taken by any linear function. Additionally, if $x(v)$ is
linearly independent of $x(v_0),\ldots,x(v_{k-1})$, then either $v$ is never reached running $T$ on linear
functions, or the two edges leaving $v$ are taken with equal probability when running $T$ on
a random linear function. A similar analysis can be made when running $T$ on quadratic
functions, replacing linear dependence with a corresponding notion of quadratic dependence.

Using this observation, we can define the linear rank of a leaf $v$, denoted $l(v)$, as the
linear rank of the labels on the path from the root to $v$. We prove that running the test tree
$T$ on a random linear function reaches $v$ with probability $2^{-l(v)}$. Similarly, we define the
quadratic rank of a leaf $v$, denoted $q(v)$, as the quadratic rank of those labels, and we
prove that running $T$ on a random quadratic function reaches $v$ with probability $2^{-q(v)}$.
We prove that the quadratic rank of any set cannot be much larger than its linear rank,
and in particular that $q(v) \le \binom{l(v)}{2} + l(v)$ for all leaves $v$. We use this inequality to prove
that a test which has completeness $c$ and query complexity $q$ accepts a random quadratic
function with probability at least $c - 1 + 2^{-q+\phi(q)}$, where $\phi(q)$ is defined as the unique
non-negative solution to $\binom{\phi(q)}{2} + \phi(q) = q$.

We use this to show that any linearity test with completeness $c$ and query complexity
$q$ must have $s \ge 2^{-q+\phi(q)}$. In particular, the Complete Graph Test on $k$ vertices has perfect
completeness, soundness $s = 2^{-\binom{k}{2}}$ and query complexity $q = \binom{k}{2} + k$. Since $\phi(q) = k$, the
Complete Graph Test is optimal among all adaptive tests with the same query complexity.

In fact, we prove a stronger claim. We say that a test $\mathcal{T}$ has average query complexity
$q$ if for any function $f$, the average number of queries performed is at most $q$. In particular
any test with query complexity $q$ also has average query complexity $q$. We prove that for
any test with completeness $c$ and average query complexity $q$, the soundness is at least
$s \ge c - 1 + 2^{-q+\phi(q)}$.

We present and analyze linearity tests over $\mathbb{F}_2$. Linearity tests can also be considered
over larger fields or groups. Our lower bound actually generalizes easily to any finite field,
but for ease of presentation, and since the techniques are exactly the same, we present
everything over $\mathbb{F}_2$. We comment further on the modifications required for general finite
fields in Section 2.

2. Preliminaries 

2.1. Linearity tests 

We call a function $f : \{0,1\}^n \to \{0,1\}$ linear if it can be written as $f(x_1,\ldots,x_n) =
a_1 x_1 + \ldots + a_n x_n$ for some $a_1,\ldots,a_n \in \{0,1\}$, where addition and multiplication are in $\mathbb{F}_2$.

A linearity test is a randomized algorithm with oracle access to the truth table of $f$,
which is supposed to distinguish the following two extreme cases:

(1) $f$ is linear (accept)

(2) $f$ is $\epsilon$-far from linear functions (reject)

where the agreement of two functions $f, g : \{0,1\}^n \to \{0,1\}$ is defined as
$d(f,g) = |\Pr_x[f(x) = g(x)] - \Pr_x[f(x) \neq g(x)]|$, and $f$ is $\epsilon$-far from linear functions if the
agreement it has with any linear function is at most $\epsilon$.

We now give some standard definitions regarding linearity tests (or more generally,
property tests). We say a test has completeness $c$ if for any linear function $f$ the test accepts
with probability at least $c$. A test has perfect completeness if $c = 1$. We say a test has
soundness $s$ if for any $f$ which is $\epsilon$-far from linear the test accepts with probability at most
$s + \epsilon'$, where $\epsilon' \to 0$ when $\epsilon \to 0$ (in fact, we talk about a family of linearity tests, for
$n \to \infty$, but we ignore this subtle point).

A test is said to have query complexity $q$ if it accesses the truth table of $f$ at most $q$ times
(for any choice of its internal randomness). A test is said to have average query complexity
$q$ if for any function $f$, the average number of accesses (over the internal randomness of the
test) made to the truth table of $f$ is at most $q$. Obviously, any test with query complexity
$q$ is also a test with average query complexity $q$.

We say a test is non-adaptive if it chooses all the locations it's going to query in the
truth table of $f$ before reading any of their values. Otherwise, we call the test adaptive.

We now turn to model adaptive tests in a way that will be more convenient for our
analysis. We first define a test tree and running a test tree on a function.

Definition 2.1. A test tree on functions $\{0,1\}^n \to \{0,1\}$ is a rooted binary tree $T$. On
each inner vertex $v$ of the tree there is a label $x(v) \in \{0,1\}^n$. On each leaf there is a label
of either accept or reject.

Definition 2.2. Running a test tree $T$ on a function $f$ is done as follows. We start at the
root of the tree $v_0$, read the value of $f(x(v_0))$, and according to the value take the left or
the right edge leaving $v_0$. We continue in this fashion on inner vertices of $T$ until we reach
a leaf of $T$. The value of $f$ in $T$ is the value of the end leaf (i.e. accept or reject), and the
depth of $f$ in $T$ is the depth of the end vertex of $f$ in $T$.

Using these definitions, we can now model adaptive tests. We identify an adaptive test
$\mathcal{T}$ on functions $\{0,1\}^n \to \{0,1\}$ with a distribution over test trees $\{T_i\}$ (also on functions
$\{0,1\}^n \to \{0,1\}$). Running the test $\mathcal{T}$ on a function $f$ is done by randomly choosing one of
the trees $T_i$ (according to their distribution), and then running the test tree $T_i$ on $f$. The
value of the function $f$ in the test tree is the result the test $\mathcal{T}$ returns on $f$.

Notice that a test has query complexity $q$ iff all trees $T_i$ have depth at most $q$, and has
average query complexity $q$ iff for any function $f$, the average depth reached in a random
tree from $\{T_i\}$ is at most $q$.
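
The test-tree model of Definitions 2.1 and 2.2 can be sketched as a small data structure. The following is only an illustrative sketch: the class names `Leaf` and `Node`, the function `run_test_tree`, and the example `blr_tree` (the BLR test for fixed $x, y$ written as a depth-3 tree) are our own hypothetical choices, and points of $\{0,1\}^n$ are again encoded as integers.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    accept: bool          # label of the leaf: accept (True) or reject (False)

@dataclass
class Node:
    label: int            # query point x(v), encoded as an n-bit integer
    zero: "Tree"          # subtree followed when f(x(v)) = 0
    one: "Tree"           # subtree followed when f(x(v)) = 1

Tree = Union[Leaf, Node]

def run_test_tree(tree, f):
    """Run a test tree on f; return (value of f in the tree, depth of f in the tree)."""
    depth = 0
    while isinstance(tree, Node):
        tree = tree.one if f(tree.label) else tree.zero
        depth += 1
    return tree.accept, depth

def blr_tree(x, y):
    """The BLR test for fixed x, y as a test tree: query x, then y, then x + y."""
    def third(expected):
        return Node(x ^ y, Leaf(expected == 0), Leaf(expected == 1))
    def second(fx):
        return Node(y, third(fx ^ 0), third(fx ^ 1))
    return Node(x, second(0), second(1))
```

An adaptive test in this model is then a probability distribution over such trees; a tree is adaptive in an essential way when the labels in the two subtrees of some vertex differ.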

In order to state our main theorem, we define the following function. For $x > 0$
define $\phi(x)$ as the unique real positive solution to $\phi(x)^2/2 + \phi(x)/2 = x$. Notice that for
positive integer $\phi(x)$, this is the same as $\binom{\phi(x)}{2} + \phi(x) = x$. The following is the main
theorem of this paper:

Theorem 2.3. (main theorem) Let $\mathcal{T}$ be an adaptive test with completeness $c$, soundness $s$
and average query complexity $q \ge 1$. Then $s + 1 - c \ge 2^{-q+\phi(q)}$.

Notice that for large $q$, $\phi(q) \approx \sqrt{2q}$, also $\sqrt{q} \le \phi(q) \le \sqrt{2q}$, so we get that in particular,
$s + 1 - c \ge 2^{-q+\Theta(\sqrt{q})}$.
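
The function $\phi$ has the closed form $\phi(x) = (-1 + \sqrt{1 + 8x})/2$, obtained by solving the quadratic equation in its definition. The sketch below evaluates it and the resulting bound of Theorem 2.3; the names `phi` and `soundness_lower_bound` are hypothetical helpers used only for this illustration.

```python
import math

def phi(x):
    """Positive root y of y^2/2 + y/2 = x."""
    return (-1 + math.sqrt(1 + 8 * x)) / 2

def soundness_lower_bound(q, c=1.0):
    """Lower bound on s implied by s + 1 - c >= 2^(-q + phi(q))."""
    return 2 ** (-q + phi(q)) - (1 - c)
```

For $q = \binom{k}{2} + k$ one gets $\phi(q) = k$, so with $c = 1$ the bound evaluates to $2^{-\binom{k}{2}}$.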

The Complete Graph Test was presented in [ST00]. The test (on a graph with $k$ vertices)
can be described as choosing $x_1,\ldots,x_k$ at random, and querying $f$ at $x_i$ (for $i = 1..k$) and
on $x_i + x_j$ (for $1 \le i < j \le k$). The test accepts $f$ if for all $i,j$

$$f(x_i) + f(x_j) + f(x_i + x_j) = 0$$

In [ST00] it is proven that the Complete Graph Test has perfect completeness and
soundness $s = 2^{-\binom{k}{2}}$. The total number of queries performed is $q = k + \binom{k}{2}$, so by our
definitions, $k = \phi(q)$ and $s = 2^{-q+\phi(q)}$. We have the following corollary:

Corollary 2.4. The Complete Graph Test is optimal among all adaptive linearity tests. 

Remark 2.5. We state and prove all results for functions $f : \{0,1\}^n \to \{0,1\}$. In fact,
the lower bound result on adaptive linearity tests holds for functions $f : \mathbb{F}^n \to \mathbb{F}$ for any
finite field $\mathbb{F}$, and not just $\mathbb{F}_2$, with only minor adjustments to the definitions and proofs.
We need to make the following modifications:

(1) Define "$\epsilon$-far from linear functions" for general fields

(2) Test trees should have $|\mathbb{F}|$ edges leaving each inner vertex instead of 2

(3) The proof that random quadratic functions are far from linear, given in Section 5,
should be slightly modified

Since the results follow simply for any finite field, we chose to present the results over $\mathbb{F}_2$
to make the presentation simpler and clearer.

3. Quadratic functions 

We will see that in order to prove Theorem 2.3 it will be enough to limit the functions
$f$ to be quadratic. We say a function $f$ is quadratic if it can be written as:

$$f(x_1,\ldots,x_n) = \sum_{i,j} a_{i,j} x_i x_j + \sum_i b_i x_i + c$$

for some $a_{i,j}, b_i, c \in \mathbb{F}_2$.

In fact, for our usage, we will force our quadratic functions $f$ to have $f(0) = 0$
(equivalently, $c = 0$ in the above description). So, throughout this paper, when speaking of
quadratic functions, we actually speak of quadratic functions $f$ with the added condition

$$f(0) = 0.$$

We will study the dynamics of a test tree $T$ in a linearity test $\mathcal{T}$ in two cases: when
applied to a uniformly random linear function, and when applied to a uniformly random
quadratic function.

The following technical lemma is the key ingredient in the proof of Theorem 2.3.

Lemma 3.1. Let $\mathcal{T}$ be an adaptive linearity test with completeness $c$ and average query
complexity $q$. Then running $\mathcal{T}$ on a random quadratic function returns accept with probability
at least $c - 1 + 2^{-q+\phi(q)}$.

In order to prove Theorem 2.3 we will also need the following simple lemma:

Lemma 3.2. Let $f$ be a random quadratic function. Then the probability that $f$ is not
$2^{-\Omega(n)}$-far from linear functions is $2^{-\Omega(n)}$.

Theorem 2.3 now follows directly from Lemmas 3.1 and 3.2. We now sketch its proof
following the two lemmas.

Proof. (of the main theorem) The average probability that $\mathcal{T}$ returns accept on a random
quadratic function which is $2^{-\Omega(n)}$-far from linear functions is at least $c - 1 + 2^{-q+\phi(q)} - 2^{-\Omega(n)}$.
So, there exists some quadratic function $f$ which is $2^{-\Omega(n)}$-far from linear and on which $\mathcal{T}$
returns accept with probability at least $c - 1 + 2^{-q+\phi(q)} - 2^{-\Omega(n)}$. Taking $n \to \infty$ shows
that $s + 1 - c \ge 2^{-q+\phi(q)}$. ■

The remainder of the paper is organized as follows. Lemma 3.1 is proved in Section 4
and Lemma 3.2 is proved in Section 5.

4. Linearity test applied to a random quadratic function 

We study tests and test trees applied to linear and quadratic functions, in order to prove
Lemma 3.1. Let $\mathcal{T}$ be an adaptive test with completeness $c$ and average query complexity
$q$. Let $T$ be some test tree which is a part of the test $\mathcal{T}$.

We start by studying the dynamics of applying $T$ to linear functions. Assume we know
that $f$ is a linear function, and we are at some vertex $v \in T$, where the path from the root
to $v$ is $v_0,\ldots,v_{k-1},v$. Assume $x(v)$ is linearly dependent on $x(v_0),\ldots,x(v_{k-1})$. Since we know
$f$ is linear, we can deduce the value of $f(x(v))$ from $f(x(v_0)),\ldots,f(x(v_{k-1}))$, and so we will always
follow the same edge leaving $v$ when we apply $T$ to any linear function. On the other hand,
if $x(v)$ is linearly independent of $x(v_0),\ldots,x(v_{k-1})$, we know that when we apply $T$ to a
random linear function, either we never reach $v$, or we have equal chances of taking either of
the two edges leaving $v$.

This gives rise to the following formal definition: 

Definition 4.1. Let $v$ be a leaf in $T$, where the path from the root to $v$ is $v_0, v_1, \ldots, v_{k-1}, v$.
We define the linear degree of $v$, denoted $l(v)$, to be the linear rank of $x(v_0),\ldots,x(v_{k-1})$.

We define $L_T$ to be the set of leaves of $T$ to which linear functions can arrive, i.e., $v \in L_T$
if the path from the root to $v$, $v_0,\ldots,v_{k-1},v$, always takes the "correct" edge leaving any
vertex $v_i$ with $x(v_i)$ linearly dependent on $x(v_0),\ldots,x(v_{i-1})$.

The following lemma formalizes the discussion above:

Lemma 4.2. For any test tree $T$:

(1) For any $v \in L_T$, the probability that a random linear function will arrive at $v$ is
$2^{-l(v)}$

(2) $\sum_{v \in L_T} 2^{-l(v)} = 1$

For $v \in L_T$, we define $c(v)$ to be 1 if the value of $v$ is accept, and $c(v) = 0$ otherwise.
Since the completeness of $\mathcal{T}$ is $c$, we have that the probability that $\mathcal{T}$ returns accept on
a random linear function is at least $c$. On the other hand, for any test tree $T$ in $\mathcal{T}$, the
probability that a random linear function will return accept is exactly $\sum_{v \in L_T} c(v) 2^{-l(v)}$. So,
the following lemma follows:

Lemma 4.3. $E_T\left[\sum_{v \in L_T} c(v) 2^{-l(v)}\right] \ge c$

where by $E_T$ here and throughout the paper we mean the average value over a random
test tree $T$ in $\mathcal{T}$.
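
The first claim of Lemma 4.2 can be checked by brute force on small examples. The sketch below is an illustration under our own conventions (points and coefficient vectors are bitmasks; `gf2_rank`, `parity` and `reach_probability` are hypothetical helper names): it computes the linear rank of the labels on a path and the exact fraction of linear functions whose answers follow that path.

```python
def gf2_rank(vectors):
    """Rank over F_2 of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clear the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)

def parity(z):
    """Sum of the bits of z modulo 2."""
    return bin(z).count("1") % 2

def reach_probability(labels, answers, n):
    """Fraction of the 2^n linear functions x -> <a, x> that give exactly
    `answers` when queried on `labels` (the labels along a root-to-leaf path)."""
    hits = sum(
        all(parity(a & x) == r for x, r in zip(labels, answers))
        for a in range(2 ** n)
    )
    return hits / 2 ** n
```

For the labels `[0b001, 0b010, 0b011, 0b100]` the linear rank is 3, and an answer pattern produced by some linear function is followed with probability $2^{-3}$, while an inconsistent pattern (a leaf outside $L_T$) is followed with probability 0.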

We now generalize the concept of linear dependence to quadratic functions. 

Definition 4.4. Let $x_1,\ldots,x_k \in \{0,1\}^n$.

(1) We say $x_1,\ldots,x_k$ are quadratically dependent if there are constants $a_1,\ldots,a_k \in \mathbb{F}_2$,
not all zero, s.t. for any quadratic function $f$ we have: $a_1 f(x_1) + \ldots + a_k f(x_k) = 0$.
Otherwise we call $x_1,\ldots,x_k$ quadratically independent.

(2) We say $x_k$ is quadratically dependent on $x_1,\ldots,x_{k-1}$ if there are constants $a_1,\ldots,a_{k-1} \in
\mathbb{F}_2$ s.t. for any quadratic function $f$ we have: $f(x_k) = a_1 f(x_1) + \ldots + a_{k-1} f(x_{k-1})$.
Otherwise we say $x_k$ is quadratically independent of $x_1,\ldots,x_{k-1}$.

(3) We define the quadratic dimension of $x_1,\ldots,x_k$ (which we also call their quadratic rank)
to be the size of the largest subset of $\{x_1,\ldots,x_k\}$ which is quadratically independent.

This definition may seem obscure, but the following alternative yet equivalent
definition will clarify it. The space of quadratic functions over $\{0,1\}^n$ is a linear space over
$\mathbb{F}_2$. Let $M$ be its generating matrix, i.e. the rows of $M$ are a basis for the linear space (in
particular, the dimensions of $M$ are $(\binom{n}{2} + n) \times 2^n$). A column of $M$ corresponds to an input
$x \in \{0,1\}^n$. Now, $x_1,\ldots,x_k$ are quadratically dependent iff the columns corresponding to
them are linearly dependent, and similarly for the other definitions.

Notice that the usual definition of linear dependence is equivalent to this more complex
definition, when applied to the linear space of all linear functions.
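
The matrix description above suggests a direct way to compute quadratic ranks on small examples: the column of $M$ for an input $x$ lists the values at $x$ of the basis monomials $x_i$ and $x_i x_j$ ($i < j$). The following sketch is an illustration under our own conventions (inputs and columns are bitmasks; `monomial_column`, `quadratic_rank` and `gf2_rank` are hypothetical helper names).

```python
from itertools import combinations

def gf2_rank(vectors):
    """Rank over F_2 of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clear the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)

def monomial_column(x, n):
    """Column of M for input x: values of x_i and of x_i * x_j (i < j) at x."""
    bits = [(x >> i) & 1 for i in range(n)]
    entries = bits + [bits[i] & bits[j] for i, j in combinations(range(n), 2)]
    return sum(e << t for t, e in enumerate(entries))

def quadratic_rank(points, n):
    """Quadratic rank of a list of points of {0,1}^n (for quadratics with f(0) = 0)."""
    return gf2_rank([monomial_column(x, n) for x in points])
```

For example, with $n = 3$ the points `[1, 2, 3]` (that is, $x$, $y$ and $x + y$) have linear rank 2 but quadratic rank 3, and the seven non-zero points have linear rank 3 and quadratic rank 6.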

We can now repeat the informal discussion at the start of this section, except this
time for quadratic functions, with all the reasoning left intact. Let $v \in T$ be a vertex,
with path from the root being $v_0,\ldots,v_{k-1},v$. Assume $x(v)$ is quadratically dependent on
$x(v_0),\ldots,x(v_{k-1})$, and $f$ is any quadratic function. The value of $f(x(v))$ can be deduced
from the already known values of $f(x(v_0)),\ldots,f(x(v_{k-1}))$, and so only one edge leaving $v$
will be taken on all quadratic functions. Alternatively, if $x(v)$ is quadratically independent
of $x(v_0),\ldots,x(v_{k-1})$, then a random quadratic function either never reaches $v$, or has equal
chances of taking each of the two edges leaving $v$.

This leads to the following definition and lemma for the quadratic degree of a vertex $v \in T$,
similar to the ones for linear degree.

Definition 4.5. Let $v$ be a leaf in $T$, where the path from the root to $v$ is $v_0, v_1, \ldots, v_{k-1}, v$.
We define the quadratic degree of $v$, denoted $q(v)$, to be the quadratic rank of $x(v_0),\ldots,x(v_{k-1})$.

We define $Q_T$ to be the set of leaves of $T$ to which quadratic functions can arrive.
Naturally $L_T \subseteq Q_T$. The following lemma on quadratic degree follows from the discussion
above:

Lemma 4.6. For any test tree $T$:

(1) For any $v \in Q_T$, the probability that a random quadratic function will arrive at $v$ is
$2^{-q(v)}$

(2) $\sum_{v \in Q_T} 2^{-q(v)} = 1$

(3) For any leaf $v$ of $T$ we have $q(v) \ge l(v)$

Last, we denote the depth of a vertex $v \in T$ by $d(v)$. Since $\mathcal{T}$ has average query complexity
$q$, we know that for any function $f$, the average depth of running a random tree $T$ of $\mathcal{T}$ on $f$
is at most $q$. So, this also holds for a random linear function. However, the average depth a
random linear function reaches in a tree $T$ is exactly $\sum_{v \in L_T} d(v) 2^{-l(v)}$, so the following lemma
follows.

Lemma 4.7. $E_T\left[\sum_{v \in L_T} d(v) 2^{-l(v)}\right] \le q$

We now wish to make a connection between $q(v)$ and $l(v)$ for vertices $v \in L_T$.
First, we prove the following lemma:

Lemma 4.8. For any $x_1,\ldots,x_k \in \{0,1\}^n$ there are coefficients $a_{i,j}, b_i \in \mathbb{F}_2$ s.t. for any
quadratic function $f$ we have:

$$f(x_1 + \ldots + x_k) = \sum_{i,j} a_{i,j} f(x_i + x_j) + \sum_i b_i f(x_i)$$

Proof. Let $f(x)$ be some polynomial of degree $d$. Its derivative in the $y$ direction is defined
to be $f_y(x) = f(x + y) - f(x)$. It's easy to see that the degree of $f_y$ as a function of $x$ is at
most $d - 1$. So, taking 3 derivatives of a quadratic function makes it the zero function,
and so in particular for any quadratic function $f$, if we take its derivatives in directions
$x$, $y$ and $z$, and evaluate the result at 0, we get that

$$((f_x)_y)_z(0) = 0$$

Expanding this expression yields:

$$f(x + y + z) - f(x + y) - f(x + z) - f(y + z) + f(x) + f(y) + f(z) - f(0) = 0$$

Since $f(0) = 0$, we can express $f(x + y + z)$ as a sum of applications of $f$ to an element,
or to a sum of two elements, of $\{x, y, z\}$. This proves the lemma for $k = 3$. For $k > 3$ we use
simple induction. ■
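
The identity used for $k = 3$ in the proof above is easy to test numerically. The sketch below is an illustration only: it draws random quadratic functions with $f(0) = 0$ and checks the identity, in which subtraction coincides with addition over $\mathbb{F}_2$. The helper names `random_quadratic` and `identity_holds` are hypothetical.

```python
import random

def random_quadratic(n, rng):
    """Random quadratic f with f(0) = 0, given by coefficients a[i, j] (i < j) and b[i]."""
    a = {(i, j): rng.getrandbits(1) for i in range(n) for j in range(i + 1, n)}
    b = [rng.getrandbits(1) for _ in range(n)]
    def f(x):
        bits = [(x >> i) & 1 for i in range(n)]
        quad = sum(c & bits[i] & bits[j] for (i, j), c in a.items())
        lin = sum(bi & xi for bi, xi in zip(b, bits))
        return (quad + lin) % 2
    return f

def identity_holds(f, x, y, z):
    """Check f(x+y+z) = f(x+y) + f(x+z) + f(y+z) + f(x) + f(y) + f(z) over F_2."""
    return f(x ^ y ^ z) == f(x ^ y) ^ f(x ^ z) ^ f(y ^ z) ^ f(x) ^ f(y) ^ f(z)
```

The identity holds for every quadratic function with $f(0) = 0$ and every choice of $x, y, z$, whereas it fails for the cubic monomial $x_1 x_2 x_3$ evaluated at the three unit vectors.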

Now we can bound $l(v)$ in terms of $q(v)$. We first prove a general result relating the
linear rank of a set to its quadratic rank.

Lemma 4.9. Let $\{x_1,\ldots,x_k\}$ be elements in $\{0,1\}^n$. Let $l$ be their linear rank, and $q$
their quadratic rank. Then

$$q \le \binom{l}{2} + l$$

Proof. Let $S \subseteq \{x_1,\ldots,x_k\}$ be a maximal quadratically independent set, so $|S| = q$. The linear
rank of $S$ is also $l$. Let $S' \subseteq S$ be a maximal set of linearly independent elements of $S$, so
$|S'| = l$. Assume w.l.o.g. that $S' = \{x_1,\ldots,x_l\}$. Since every $x \in S$ is linearly dependent on
$S'$, it can be written as a sum of some of the elements of $S'$. Using Lemma 4.8 we get that
for any $x \in S$ there exist coefficients $a^{(x)}_{i,j}, b^{(x)}_i \in \mathbb{F}_2$ s.t. for any quadratic function $f$:

$$f(x) = \sum_{1 \le i < j \le l} a^{(x)}_{i,j} f(x_i + x_j) + \sum_{1 \le i \le l} b^{(x)}_i f(x_i)$$

We have assumed that all the elements of $S$ are quadratically independent. For this to
hold, the above equations in the symbolic variables $f(x_i + x_j)$ and $f(x_i)$ must be linearly
independent. So the number of equations $q$ must be at most the number of variables, which
is $\binom{l}{2} + l$. So, we get that:

$$q = |S| \le \binom{l}{2} + l$$ ■



Lemma 4.10. For any leaf $v \in L_T$, $l(v) \ge \phi(q(v))$

Proof. Let $v_0,\ldots,v_{k-1},v$ be the path in $T$ from the root to $v$. Let $x_i = x(v_i)$ for $i = 0..k-1$.
Apply Lemma 4.9 on $\{x_0,\ldots,x_{k-1}\}$ to get that $q(v) \le \binom{l(v)}{2} + l(v)$. Inverting this inequality,
since $\phi(x)$ is monotone, we get that $l(v) \ge \phi(q(v))$. ■

We can now prove our main technical lemma (Lemma 3.1). We start with some technical
lemmas. We define $\psi(x)$ to be $x - \phi(x)$ for $x \ge 1$, and 0 for $x \le 1$. Notice that $\psi$ is
continuous, and $\psi(x) = x - \phi(x)$ for any non-negative integer $x$. Hence, using Lemma 4.10
we get that:

Lemma 4.11. For any vertex $v$ in a tree $T$, $q(v) - l(v) \le \psi(q(v))$.

Lemma 4.12. $\psi$ is increasing and convex.

Proof. Since $\psi$ is continuous and constant for $x \le 1$, it's enough to prove the claim for
$x \ge 1$ (for increasing it's clear, and once we've proved $\psi$ is increasing, it shows it's enough
to prove convexity for $x \ge 1$). We first show $\psi$ is increasing.

For $x \ge 1$, define $y = \phi(x)$, so $x = y^2/2 + y/2$ and $\psi(x) = x - y = y^2/2 - y/2$. Then

$$\frac{d\psi}{dx} = \frac{d\psi}{dy} \cdot \frac{dy}{dx} = \frac{d\psi/dy}{dx/dy} = \frac{y - 1/2}{y + 1/2}$$

If $x \ge 1$ then $y = \phi(x) \ge 1$, hence $\frac{d\psi}{dx} > 0$ for $x \ge 1$, and so $\psi$ is increasing.
To show that $\psi$ is convex,

$$\frac{d^2\psi}{dx^2} = \frac{d}{dy}\left(\frac{y - 1/2}{y + 1/2}\right) \cdot \frac{dy}{dx} = \frac{1}{(y + 1/2)^3} > 0$$ ■
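
The derivative formula and the two properties claimed in Lemma 4.12 can be sanity-checked numerically. The following is only a rough numerical illustration, not a proof; `phi`, `psi` and `psi_derivative` are hypothetical helper names, and `phi` uses the closed form $(-1 + \sqrt{1 + 8x})/2$.

```python
import math

def phi(x):
    """Positive root y of y^2/2 + y/2 = x."""
    return (-1 + math.sqrt(1 + 8 * x)) / 2

def psi(x):
    """psi(x) = x - phi(x) for x >= 1, and 0 for x <= 1."""
    return x - phi(x) if x >= 1 else 0.0

def psi_derivative(x):
    """Closed form (y - 1/2) / (y + 1/2) with y = phi(x), valid for x >= 1."""
    y = phi(x)
    return (y - 0.5) / (y + 0.5)
```

Comparing `psi_derivative` with a central finite difference of `psi`, and checking monotonicity and midpoint convexity on a grid, gives a quick consistency check of the computation above.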



We are now finally ready to prove Lemma 3.1.

Proof. (of Lemma 3.1) We need to prove that any test $\mathcal{T}$ with completeness $c$ and average
query complexity $q \ge 1$ accepts a random quadratic function with probability at least
$c - 1 + 2^{-\psi(q)}$, which equals $c - 1 + 2^{-q+\phi(q)}$ since $\psi(q) = q - \phi(q)$ for $q \ge 1$. Let us
denote by $p$ the probability that the test accepts a random quadratic function. Let $p_T$
denote the probability that a tree $T$ accepts a random quadratic function. $p_T$
is at least the probability that a random quadratic function reaches a leaf in $L_T$ which is
labeled accept. So:

$$p_T \ge \sum_{v \in L_T} c(v) 2^{-q(v)}$$

We now analyze $p = E_T[p_T]$:

$$p \ge E_T\left[\sum_{v \in L_T} c(v) 2^{-q(v)}\right] = E_T\left[\sum_{v \in L_T} 2^{-l(v)} c(v) 2^{-q(v)+l(v)}\right]$$

We write the sum on the right-hand side as $p_0 - p_1$, with $p_0, p_1 \ge 0$, where:

$$p_0 = E_T\left[\sum_{v \in L_T} 2^{-l(v)} 2^{-q(v)+l(v)}\right]$$

and

$$p_1 = E_T\left[\sum_{v \in L_T} 2^{-l(v)} (1 - c(v)) 2^{-q(v)+l(v)}\right]$$

We start by analyzing $p_1$. Since for any $v$ always $q(v) \ge l(v)$ we have:

$$p_1 \le E_T\left[\sum_{v \in L_T} 2^{-l(v)} (1 - c(v))\right]$$

Recall that by Lemma 4.2 for any tree $T$ we have

$$\sum_{v \in L_T} 2^{-l(v)} = 1$$

and by Lemma 4.3 we have

$$E_T\left[\sum_{v \in L_T} 2^{-l(v)} c(v)\right] \ge c$$

so we conclude that:

$$p_1 \le 1 - c$$

We move to analyze $p_0$. Since $E_T\left[\sum_{v \in L_T} 2^{-l(v)}\right] = 1$ and since the function $x \mapsto 2^x$ is
convex, we have by Jensen's inequality that:

$$p_0 \ge 2^{E_T\left[\sum_{v \in L_T} 2^{-l(v)} (-q(v) + l(v))\right]}$$

Now, we have that $q(v) - l(v) \le \psi(q(v))$ by Lemma 4.11, and by Lemma 4.12 ($\psi$ is increasing),
since $q(v) \le d(v)$, we get $\psi(q(v)) \le \psi(d(v))$. So we get:

$$E_T\left[\sum_{v \in L_T} 2^{-l(v)} (q(v) - l(v))\right] \le E_T\left[\sum_{v \in L_T} 2^{-l(v)} \psi(d(v))\right]$$

Since by Lemma 4.12 $\psi$ is convex, we get again by Jensen's inequality that
this is at most $\psi\left(E_T\left[\sum_{v \in L_T} 2^{-l(v)} d(v)\right]\right)$. By Lemma 4.7,

$$E_T\left[\sum_{v \in L_T} 2^{-l(v)} d(v)\right] \le q$$

where $q$ is the average query complexity of $\mathcal{T}$. So, since $\psi$ is increasing, we conclude that
$p_0 \ge 2^{-\psi(q)}$, and in total

$$p \ge p_0 - p_1 \ge 2^{-\psi(q)} + c - 1$$ ■



5. Random quadratic function is far from linear 

In this section we prove Lemma 3.2, i.e. that a random quadratic function is far from
linear. We will use commonly known facts about quadratic functions.
Any quadratic function can be written as:

$$f(x) = x^t A x + \langle x, b \rangle$$

The correlation of $f$ with some linear function $g$ is the $g$-th Fourier coefficient of $f$.
The Fourier coefficients of quadratic functions are well studied. In particular, it is known
that all the non-zero Fourier coefficients of $f$ have the same absolute value, and that the number
of non-zero Fourier coefficients is $2^{\mathrm{rank}(A + A^t)}$. So, in order to show that $f$ has no large
correlation with any linear function, it's enough to show that $B = A + A^t$ has high rank.
In particular, in order to show that $f$ is $2^{-\Omega(n)}$-far from linear functions, we need to show
that $B$ has rank $\Omega(n)$. We will show that the probability that a random quadratic function
has rank less than $n/4$ is $2^{-\Omega(n)}$. We will use the following lemma:

Lemma 5.1. The number of matrices of rank at most $k$ is at most $n^k 2^{nk}$.

Using Lemma 5.1 it's easy to prove Lemma 3.2. The number of matrices of rank
at most $n/4$ is at most $2^{n^2/4 \cdot (1+o(1))}$. For a random quadratic function, $B$ is a random
symmetric matrix with zero diagonal, and so the probability that $B$ has rank less than $n/4$
is $2^{-n^2/4 \cdot (1+o(1))} = 2^{-\Omega(n)}$.
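
The Fourier-analytic fact used at the start of this section (all non-zero Fourier coefficients of a quadratic function share one absolute value, and there are $2^{\mathrm{rank}(A + A^t)}$ of them) can be checked by brute force for small $n$. The sketch below is an illustration under our own conventions: inputs are bitmasks, coefficients are left unnormalised (sums of $\pm 1$), and `spectrum_summary`, `gf2_rank` and `parity` are hypothetical helper names.

```python
import random

def gf2_rank(vectors):
    """Rank over F_2 of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clear the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)

def parity(z):
    """Sum of the bits of z modulo 2."""
    return bin(z).count("1") % 2

def spectrum_summary(n, rng):
    """For a random f(x) = x^t A x + <x, b>, return
    (number of non-zero Fourier coefficients, number of distinct magnitudes, rank of A + A^t)."""
    A = [[rng.getrandbits(1) for _ in range(n)] for _ in range(n)]
    b = [rng.getrandbits(1) for _ in range(n)]
    def f(x):
        bits = [(x >> i) & 1 for i in range(n)]
        quad = sum(A[i][j] & bits[i] & bits[j] for i in range(n) for j in range(n))
        return (quad + sum(bi & xi for bi, xi in zip(b, bits))) % 2
    # Unnormalised coefficient for the linear function <a, x>: sum_x (-1)^(f(x) + <a, x>)
    coeffs = [sum((-1) ** (f(x) ^ parity(a & x)) for x in range(2 ** n))
              for a in range(2 ** n)]
    magnitudes = [abs(c) for c in coeffs if c != 0]
    B_rows = [sum((A[i][j] ^ A[j][i]) << j for j in range(n)) for i in range(n)]
    return len(magnitudes), len(set(magnitudes)), gf2_rank(B_rows)
```

For every random choice of $A$ and $b$, the first returned value should equal 2 raised to the third, and the second should be 1.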

Now we finish by proving Lemma 5.1.

Proof. Let $B$ be a matrix of rank at most $k$. There are $\binom{n}{k}$ options to choose $k$ rows which
span the row span of the matrix; each other row has at most $2^k$ options since it must be
in the row span of $k$ specific rows. So, the number of possibilities for rank $k$ matrices is at
most: